EXECUTIVE SUMMARY


A.  BACKGROUND  AND  SCOPE 

The  Vehicular  Mounted  Mine  Detector  (VMMD)  ATD  demonstration  occurred  in 
June  1998  at  Aberdeen  Proving  Ground,  Maryland,  and  in  July  1998  at  Socorro,  New 
Mexico.  It  was  decided  that  it  would  be  beneficial  to  devote  a  small  amount  of  that  time 
to  conducting  a  series  of  tests  specific  to  the  Thermal  Neutron  Analysis  (TNA)  detector 
being  used  by  Computing  Devices  Canada  (CDC).  In  April  1998,  a  TNA-specific  test 
plan  was  devised  to  address  performance  issues  in  a  more  thorough  and  systematic 
manner  than  had  been  done  in  the  past.  Unfortunately,  there  were  sufficient  constraints 
such  that  this  test  plan  could  not  be  implemented  as  designed.  Instead,  a  much  abridged 
version was conducted at Aberdeen Proving Ground. At Socorro, the test that was
actually conducted yielded only a single PD/PFA value and hence was not scientifically
interesting. Thus, in this report we focus most of our attention on the Aberdeen results.

B.  TEST  RESULTS 

1.  Aberdeen  Proving  Ground,  Maryland 

The  main  results  from  the  APG  test  are  as  follows: 

•  CDC  obtained  a  PD  of  63  percent  (12  out  of  19  mines  detected)  and  a  PFA  of  0 
percent  (0  out  of  22)  using  a  threshold  value  CDC  selected  for  the  decision 
criterion.  This  threshold  was  nonoptimal;  the  receiver  operating  characteristic 
(ROC)  curve  (see  Figure  ES-1)  indicates  that  a  PD  of  79  percent  (15  out  of  19 
mines)  with  a  PFA  of  0  percent  was  possible  with  the  optimal  choice  of  the 
threshold.  The  ROC  performance  curve  was  insensitive  to  alternative  decision 
criteria. 

•  Surface  mines  were  more  likely  to  be  detected  than  buried  mines,  although 
one  surface  mine  went  undetected. 

•  Performance  showed  no  dependence  on  depth. 

•  Performance  showed  no  dependence  on  mine  type  (or  nitrogen  content). 

•  Three  mines  yielded  very  low  signatures:  an  M19  at  1.5  in.,  a  TM62P  on  the 
surface,  and  a  TM46  at  2  in. 


Figure  ES-1.  ROC  Curves  for  Aberdeen  and  Socorro  Tests 


• It is possible that the square shape of an M19 renders it more difficult to
detect. An M19 on the surface yielded the second weakest signal of all the
surface mines; its signal was even weaker than those of a TM62M and a
TMA4, both of which contain significantly less nitrogen than an M19.
Furthermore, during a separate reproducibility test, an M19 at 1.5 in. twice
yielded a significantly weaker signal than an M15 at 1.5 in., even though the
M15 contains only 10 percent more nitrogen than an M19.

•  Target  variability  played  a  larger  role  than  background  variability  in  detection 
rate. 

•  Test  execution  shortfalls  preclude  resolution  of  key  issues. 

2.  Socorro,  New  Mexico 

As  noted  above,  quantitative  analysis  was  essentially  impossible  given  the  lack  of 
data  obtained  at  Socorro.  The  only  result  that  can  be  reported  is  a  PD  of  100  percent  (19 
out  of  19  mines  detected)  and  a  PFA  of  32  percent  (6  out  of  19). 

C.  CONCLUSIONS  AND  RECOMMENDATIONS 

The  Aberdeen  Proving  Ground  tests  yielded  several  interesting  results.  First, 
target  signature  variability  dominated  the  test  results.  This  was  surprising,  given  that 
usually the background phenomenology dominates TNA performance. Second, target
shape may be an important driver, as indicated by the relatively poor detectability of the
M19. Finally, that the detectability of a mine did not seem to depend on either its burial
depth or its nitrogen content is counterintuitive. Unfortunately, conclusions about these
results can only be drawn with extreme caution; it may be that some or all are simply an
artifact  of  an  extremely  small  data  set.  Aspects  of  the  test  plan  that  were  intended  to  shed 
light  on  exactly  these  types  of  issues  were  not  completed.  We  believe  that  it  is  in  the 
Army’s  interest  to  resolve  these  issues  with  a  more  thorough  test,  such  as  the  one  detailed 
in  Appendix  A  of  this  report. 
I. INTRODUCTION


The  Vehicular  Mounted  Mine  Detector  (VMMD)  ATD  demonstration  occurred  in 
June  1998  at  the  Aberdeen  Proving  Ground  (APG),  Maryland,  and  in  July  1998  at 
Socorro,  New  Mexico.  It  was  decided  that  it  would  be  beneficial  to  devote  a  small 
amount  of  time  during  the  demonstration  to  a  series  of  tests  specific  to  the  Thermal 
Neutron  Analysis  (TNA)  detector  being  used  by  Computing  Devices  Canada  (CDC).  The 
primary  goal  of  these  TNA-specific  tests  was  to  address  performance  issues  in  a  more 
thorough  and  systematic  manner  than  had  been  done  in  the  past.  In  April  1998,  a  test  plan 
was  devised  to  address  four  major  objectives:  (1)  the  amount  of  soil  content  variability  to 
be  expected  on-site;  (2)  measurement  reproducibility,  on  both  a  short  and  long  time  scale; 
(3)  the  spatial  response  of  the  device;  and  (4)  the  generation  of  performance  curves,  or 
receiver  operating  characteristic  (ROC)  curves  (PD  vs.  PFA),  for  different  mine  types  and 
depths. 

The  first  objective  is  important  for  two  reasons:  first,  soil  content  variability  on  a 
small  scale  renders  background  subtraction  difficult;  and  second,  large  concentrations  of 
certain  isotopes  can  cause  problems  directly  or  indirectly.  For  example,  a  high  nitrogen 
concentration  in  the  soil  can  interfere  directly  with  the  TNA  detector’s  ability  to  detect 
the γ-rays emitted by nitrogen in the explosive. In addition, γ-rays generated by neutron
capture by several isotopes can cause pile-up in the detectors. The second objective
addresses  the  “system  noise  level”  of  the  device.  The  third  objective  addresses  the  fact 
that  TNA  is  a  confirmatory  sensor  and  as  such,  it  will  be  cued  by  other  sensors.  The 
spatial  accuracy  of  the  cuing  device  must  therefore  be  compatible  with  the  spatial 
response  of  the  TNA  device.  The  fourth  objective  provides  an  understanding  of  how 
performance  depends  on  nitrogen  content  and  depth. 

Appendix  A  details  this  test  plan.  Unfortunately,  there  were  sufficient  constraints 
such  that  this  test  plan  could  not  be  implemented  as  designed.  Instead,  a  much  abridged 
version  was  conducted  at  APG.  At  Socorro,  the  test  actually  conducted  yielded  only  a 
PD/PFA value and hence was not scientifically interesting. Thus, we focus most of our
attention  on  the  Aberdeen  results  in  this  report. 

As will be seen in the Test Implementation section, only three of the four objectives
of our original test plan were even attempted. The spatial response test was not
conducted. Further, the reproducibility test addressed only the short time scale, and even
then,  it  was  insufficient.  For  the  performance  curves,  only  41  targets  were  measured,  19 
of  which  were  mines.  Thus,  although  we  present  a  ROC  curve  as  well  as  figures 
displaying  performance  dependence  on  nitrogen  content  and  depth,  one  must  be  very 
careful  not  to  draw  too  many  conclusions  from  such  a  limited  data  set. 
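
To give a feel for how weakly 19 mines and 22 mine-free locations constrain the underlying rates, the sketch below computes approximate 95 percent Wilson score intervals for the PD and PFA obtained with the CDC threshold. The Wilson interval is chosen here purely for illustration and was not part of the test analysis; the function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Approximate two-sided Wilson score interval for a binomial proportion."""
    p_hat = successes / trials
    denom = 1.0 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / trials
                         + z**2 / (4 * trials**2)) / denom
    return center - half, center + half

# Counts reported for the Aberdeen test with the CDC threshold.
pd_low, pd_high = wilson_interval(12, 19)    # 12 of 19 mines detected
pfa_low, pfa_high = wilson_interval(0, 22)   # 0 false alarms in 22 blanks

print(f"PD  interval: {pd_low:.2f} to {pd_high:.2f}")
print(f"PFA interval: {pfa_low:.2f} to {pfa_high:.2f}")
```

Under these assumptions the PD interval spans roughly 0.41 to 0.81, which illustrates why conclusions drawn from this data set must be treated with caution.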

For  a  detailed  description  of  the  TNA  system  employed  by  CDC,  the  reader  is 
referred  to  Dr.  John  McFee  at  the  Defence  Research  Establishment,  Suffield.  In  brief,  the 
TNA sensor consists of an isotopic Cf-252 neutron source and a moderator that slows the
neutrons down to near-thermal energies before they penetrate the ground. Four NaI
detectors surrounding the source detect the 10.8 MeV γ-rays emitted by the nitrogen in
the explosive. The detection window extends from 10.05 MeV to 11.30 MeV, so that a
significant number of counts in the window are due to contributions from background
rather than from the nitrogen in the explosive. This background contribution is subtracted
off by using background spectra collected in places where it is known that no mine is
present. Contributions to this background include a 10.6 MeV γ-ray generated by Si in
the soil, pile-up resulting from γ-rays produced by neutron capture on various elements
in the soil, fast-neutron capture in the NaI detectors, and cosmic rays.




II.  TEST  IMPLEMENTATION 


Due  to  time  and  other  constraints,  the  full  TNA  test  plan  detailed  in  Appendix  A 
could  not  be  implemented  during  the  VMMD  tests.  Here,  we  describe  the  tests  that  were 
actually implemented at APG and at Socorro. Table II-1 shows the accomplishments
relative  to  the  goals. 


Table II-1. Test Accomplishments Relative to Goals

Category             Goal                             Aberdeen        Socorro
Soil analysis        minerals, CNOH, water, density   minerals, CNO   CNO
Reproducibility      20                               9               0
Spatial Response     100                              0               0
Declarations         80                               41              39
Confidence Measure   80                               41              0
Spectra              80                               0               0

A.  ABERDEEN  PROVING  GROUND 


On the morning of June 17, 1998, CDC completed a significantly scaled-down
version of a “repeatability” test. Four locations were marked in Lane 11 with both painted
crosses and golf tees. At each location, CDC took measurements for 2 minutes. They then
moved forward several meters beyond the fourth location, backed up, and repeated the
measurements at each location in reverse order. Thus, a total of eight measurements were
made, two at each of the four locations.¹ Although it is not possible to draw any
quantitative conclusions about the repeatability of the TNA platform from this data set,
we offer qualitative conclusions in the Test Results section. Figure II-1 shows the
ground truth of this test. An M15 was buried 1.5 in. below Location 1. At Location 2, a
large object had been dug out and the soil replaced, thereby creating an area less dense
than the surrounding soil. Location 3 was undisturbed soil. An M19 was buried 1.5 in.
below Location 4.


1  Location  1  was  actually  measured  three  times,  yielding  a  total  of  nine  measurements. 
On the morning of June 19, 1998, CDC completed measurements, each lasting
2 minutes, at 41 specified locations. The time required to complete the test was about
3.5 hours. This time includes the time required to move and align the system, as well as
that required to repeat several background and calibration measurements. Of the 41 sites,
19 were mines of various nitrogen content and depth (see Table II-2). The last column of
Table II-2 indicates how many of each mine were detected by CDC. Although 41 data
points are not sufficient to characterize accurately the dependence of system
performance on nitrogen content and depth, we draw some conclusions and present them,
along with the ROC curve, in the Test Results section.


Figure II-1. Repeatability Test Layout


Table II-2. Inventory of Mines for ROC Curve Test

Mine Type   Depth     Nitrogen Content (kg)   Quantity   Number of CDC Detects
M15         Surface   3.08                    2          2
M15         1.5 in.   3.08                    3          3
M19         Surface   2.85                    1          1
M19         1.5 in.   2.85                    3          1
TM62M       Surface   1.58                    2          2
TM62P       Surface   1.33                    1          0
TM62P       3 in.     1.33                    2          1
TM62M       4 in.     1.58                    2          1
TM46        2 in.     1.06                    1          0
TMA4        Surface   1.02                    1          1
TMA4        2 in.     1.02                    1          0


It  should  be  noted  that  at  both  Aberdeen  and  Socorro  the  marked  coordinates  of 
the  mines  correspond  to  the  center  of  the  mine;  there  is  no  offset  in  the  designation 
relative  to  it.  Therefore,  the  survey  should  be  accurate  to  within  a  few  centimeters.  Any 
offset  in  the  placement  of  the  apparatus  relative  to  the  mine  is  a  result  of  the  manual 
alignment  of  the  TNA  system,  and  is  presumably  small.  Further,  for  those  mines  not  on 
the  surface,  mines  of  a  given  type  are  buried  at  the  same  depth.  So  to  first  order,  the 
attenuation of neutrons and γ-rays is the same for mines of a given type. Finally, all mines
of  a  given  type  are  presumed  to  have  fixed  composition,  especially  with  respect  to 
nitrogen  content. 

B.  SOCORRO 

On  July  23,  1998,  CDC  completed  a  truncated  test  of  the  TNA  system.  Neither 
the  planned  reproducibility  test  nor  the  spatial  response  test  was  attempted.  Moreover, 
due  to  the  absence  of  key  personnel,  only  yes/no  responses  were  generated,  instead  of 
confidence  reports.  Thus,  very  little  analysis  of  the  test  results  is  possible. 

The  test  was  conducted  on  lanes  11,  12,  and  13  at  the  Socorro  facility.  The  test 
consisted  of  measurements  taken  for  2  minutes  at  each  of  38  designated  locations.  At 
9:00  a.m.,  the  initial  calibration  measurement  was  completed  and  the  first  background 
measurement  started.  The  test  was  completed  at  12:22  p.m.  Thus,  the  total  time  required 
per  test  measurement  was  5  minutes  and  20  seconds.  This  includes  the  time  to  move  and 
align  the  system,  as  well  as  overhead  for  several  background  measurements  and 
recalibrations. 
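
The per-measurement time quoted above follows from simple arithmetic on the start and end times. The sketch below repeats that calculation; the variable names are illustrative, and the result, about 5 minutes 19 seconds, is consistent with the rounded figure in the text.

```python
from datetime import datetime

# Start and end of the Socorro test as stated in the text.
start = datetime(1998, 7, 23, 9, 0)
end = datetime(1998, 7, 23, 12, 22)
n_locations = 38          # designated test locations
count_time_s = 120        # 2-minute counting time per location

seconds_per_location = (end - start).total_seconds() / n_locations
minutes, seconds = divmod(round(seconds_per_location), 60)

# Time per location not spent counting (moving, aligning, background
# measurements, recalibrations), averaged over all locations.
overhead_s = seconds_per_location - count_time_s

print(f"Average time per location: {minutes} min {seconds} s")
print(f"Average overhead per location: {overhead_s:.0f} s")
```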

Of  the  38  designated  locations,  19  locations  corresponded  to  buried  mines 
containing  explosive  charges.  Table  II-3  gives  the  mine  inventory  for  the  Socorro  test. 
This  inventory  is  similar  to  that  of  the  Aberdeen  test,  except  that  no  surface  mines  were 
included.  The  other  19  locations  were  at  least  4  m  from  any  buried  mine,  with  or  without 
explosives. 


Table II-3. Inventory of Mines for the Socorro Test

Mine Type   Depth (in.)   Nitrogen Content (kg)   Quantity   Number of CDC Detects
M15         1.5           3.08                    4          4
M19         1.5           2.85                    3          3
TM62M       4             1.58                    3          3
TM62P       3             1.33                    3          3
TM46        2             1.06                    3          3
TMA4        2             1.02                    3          3

III. TEST DATA


We  have  scored  and  analyzed  the  set  of  response  data  provided  by  CDC.  In  this 
section  we  explain  the  meaning  of  the  data,  document  CDC  terminology,  and  explain 
various checks that we have performed to ensure that the data set is self-consistent and
that our interpretations are correct.

The procedure outlined here has been applied to the entire database. The single
test measurement shown in Figure III-1 will serve as an example for this discussion. Note
there are four independent γ-ray detection subsystems, or “channels,” labeled 0...3, in the
system; the “sum” channel is a function of the four independent channels.


>Target ab302 t=120 sec   Bckgnd abbg01 t=300 sec   Energy cal =abecal01
>
>Chan #   net counts   bckgnd counts   var net counts   var bckgnd counts
>  0       29.9         105.1           177.8            42.8
>  1       102.0        108.0           253.0            43.0
>  2       25.8         61.2            112.0            25.0
>  3       19.7         83.3            137.2            34.2
>
>Channel   Net      Std Net   P(alpha)
>  0       29.91    13.33     0.012
>  1       102.03   15.91     0.000
>  2       25.82    10.58     0.007
>  3       19.73    11.71     0.046
>  sum     177.49   26.08     0.000

Figure III-1. A Sample Test Data Record as Provided by CDC

A  brief  description  of  the  measurement  protocol  will  help  to  understand  and 
interpret  the  data  shown  here.  There  are  three  types  of  measurements  that  are  taken  in  the 
course  of  the  test:  calibration,  background,  and  test  measurements.  Each  measurement 
yields an energy spectrum of γ-rays. The purpose and interpretation of each measurement
follows. 

•  At  the  beginning  of  the  test,  and  periodically  thereafter,  a  standard  target  is 
used  to  collect  an  energy  calibration  measurement.  This  calibration  is  used  to 
monitor  and  compensate  for  any  drift  in  system  gain  that  might  be  expected 
(for  example,  from  changes  in  the  temperature  of  the  photomultiplier  tubes). 
The spectrum determines the limits on the pulse height window corresponding
to the location of the 10.8 MeV nitrogen peak. This window
determines  the  interval  over  which  the  spectra  will  be  summed  in  subsequent 
measurements.  The  procedure  is  applied  upstream  of  the  data  reported  here, 
and  will  not  be  discussed  further. 

• At the beginning of the test, and periodically thereafter, a 5-minute background
measurement is made at a location where there is no mine. This
measurement  is  used  to  compensate  for  any  long-range  (on  a  scale  of  several 
tens  of  meters)  variations  in  the  soil.  This  spectrum  will  be  normalized  to, 
and  subtracted  from,  the  subsequent  test  measurements. 

•  A  test  measurement  is  made  by  counting  for  2  minutes  at  each  designated  test 
location. 

Thus,  the  purpose  of  the  calibration  and  background  measurements  is  to  properly 
correct  each  test  measurement.  For  each  test  measurement,  there  are  essentially  three 
independent  numbers  for  each  of  the  four  channels: 

•  N,  the  integer  number  of  counts  in  the  nitrogen  window  for  the  2-minute  test 
measurement; 

•  NB,  the  integer  number  of  counts  in  the  same  window  from  the  previous 
5-minute  background  measurement;  and 

•  c,  the  normalization  constant  determined  by  matching  the  test  spectrum  with 
the  previous  background  spectrum  in  some  region  outside  the  nitrogen 
window.  This  number  should  be  close  to  0.4,  the  ratio  of  the  counting  times, 
if  the  soil  properties  do  not  vary. 

The data provided by CDC do not contain these raw numbers, but intermediate
results based on them. All of the numbers in Figure III-1 can be derived from N, NB, and
c; conversely, we can invert the process and back them out from the data provided. The
bullets that follow define the relationship between the raw data and the data provided.
Refer to Channel 0 in the data record above for the numbers presented here.

• net counts = N - c*NB and bckgnd counts = c*NB, so
N = net counts + bckgnd counts = 135.0, an integer.

• bckgnd counts = c*NB and var bckgnd counts = c^2*NB, so
c = var bckgnd counts/bckgnd counts = 0.407 ≈ 0.4, as suspected.
[This follows from the fact that for the Poisson distribution, var(COUNT) =
COUNT, and so var(const*COUNT) = const^2*COUNT.]

• Further, NB = bckgnd counts^2/var bckgnd counts = 258.08, an
integer to the precision quoted.

• As a check, observe that var net counts = N + c^2*NB, as it must.
• Net is just net counts to better precision.

•  Std  Net  =  SQRT(var  net  counts),  as  it  must. 

• P(alpha) is 1 minus the cumulative normal distribution of Net/Std
Net. It is the probability, under the null hypothesis (“no mine here!”), of
obtaining a value of Net/Std Net at least this large; a small value argues for
rejecting that hypothesis.
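
The inversion described in the bullets above can be sketched in a few lines, using the Channel 0 values from the sample record. The variable names are illustrative, and the upper-tail normal probability is computed with the complementary error function from the standard library; this is a check of the arithmetic, not CDC's own code.

```python
import math

# Channel 0 values transcribed from the sample record (Figure III-1).
net_counts = 29.9
bckgnd_counts = 105.1
var_net_counts = 177.8
var_bckgnd_counts = 42.8

# Invert the definitions to recover the raw quantities.
N = net_counts + bckgnd_counts                 # counts in the nitrogen window
c = var_bckgnd_counts / bckgnd_counts          # normalization constant
NB = bckgnd_counts**2 / var_bckgnd_counts      # background counts in the window

# Forward check: rebuild the reported quantities from N, NB, and c.
var_net_check = N + c**2 * NB                  # should match var net counts
std_net = math.sqrt(var_net_check)             # should match Std Net
z = net_counts / std_net
p_alpha = 0.5 * math.erfc(z / math.sqrt(2.0))  # 1 minus the normal CDF at z

print(f"N = {N:.1f}, c = {c:.3f}, NB = {NB:.2f}")
print(f"var net = {var_net_check:.1f}, Std Net = {std_net:.2f}, P(alpha) = {p_alpha:.3f}")
```

Because the record quotes net counts to only one decimal place in the first block, the recomputed P(alpha) agrees with the tabulated 0.012 only to the precision shown.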

Thus,  the  ratio  of  Net/Std  Net  is  being  interpreted  as  the  z-score  based  on 
counting  statistics.  The  implicit  assumption  is  that  in  the  absence  of  mines  the  variation 
in  this  quantity  should  be  dominated  by  counting  statistics.  If  this  assumption  is  correct, 
this  set  of  numbers,  corresponding  to  the  locations  where  there  is  no  mine  present,  should 
have  zero  mean  and  unit  standard  deviation. 

Now  refer  to  the  row  labeled  sum: 

•  Net  is  the  sum  of  the  four  numbers  above  it. 

•  Std  Net  is  the  sum  in  quadrature  of  the  four  numbers  above  it. 

• P(alpha) is 1 minus the cumulative normal distribution of Net/Std
Net, as before.

It is noted elsewhere in the CDC database that the decision criterion, called “Pmin”,
is derived by taking the minimum of the five numbers in the P(alpha) column. For this
test, the threshold value for Pmin was 0.001; below that value a mine was declared, and
above it no mine. In the analysis, we find that this decision criterion gives performance
that is not significantly different from other reasonable criteria based on this data.
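
As an illustration of the sum row and the Pmin criterion, the sketch below recomputes both from the per-channel Net and Std Net values in the sample record. The helper `upper_tail` and the variable names are hypothetical; small differences from the tabulated values (for example, 26.07 versus 26.08 for the summed Std Net) arise because the inputs are already rounded.

```python
import math

def upper_tail(z):
    """1 minus the standard normal cumulative distribution at z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Per-channel values from the second block of the sample record.
net = [29.91, 102.03, 25.82, 19.73]
std_net = [13.33, 15.91, 10.58, 11.71]

# P(alpha) for each of the four channels.
p_channels = [upper_tail(n / s) for n, s in zip(net, std_net)]

# Sum row: Net adds linearly, Std Net adds in quadrature.
net_sum = sum(net)
std_sum = math.sqrt(sum(s**2 for s in std_net))
p_sum = upper_tail(net_sum / std_sum)

# Pmin is the minimum of the five P(alpha) values.
p_min = min(p_channels + [p_sum])
THRESHOLD = 0.001            # threshold quoted in the text for this test
mine_declared = p_min < THRESHOLD

print([round(p, 3) for p in p_channels], round(p_sum, 3))
print(f"net sum = {net_sum:.2f}, std sum = {std_sum:.2f}, mine declared: {mine_declared}")
```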

So, to summarize, each row of the database can be derived from three numbers. N
is the number of counts at the given location in the nitrogen window. NB is the number of
counts in the background spectrum in the same window; this number only changes when
a new background or calibration spectrum measurement is taken. The normalization
constant, c, changes based on spectral counts outside the nitrogen window, but is always
within a few percent of the ratio of the counting times for N and NB.
IV. TEST RESULTS


A.  ABERDEEN  PROVING  GROUND 
1.  Repeatability  Test 

Table IV-1 gives the main results of the CDC TNA repeatability test. The
parameter “Pmin” is the statistic used by CDC to determine whether a mine is present. It is
essentially 1 minus the cumulative normal distribution of the net counts divided by the
standard deviation of those counts, that is, the probability of obtaining a net count at
least this large if the null hypothesis (no mine) holds. The “min” refers to the fact that
this value is calculated for each of the four detectors and for the summed channel, so that
a detection is declared if P is less than a threshold value in any one
detector or in the summed channel. (The threshold value for the repeatability test was
0.02.) Pmin is strongly correlated with the total net counts (summed over the four
detectors) divided by the total background counts. This statistic is labeled “net/bckgnd”
in Table IV-1.

Although  a  test  yielding  only  two  measurements  per  location  does  not  allow  for  a 
quantitative  assessment  of  the  repeatability  of  the  TNA  system,  the  assumption  that  the 
system yields repeatable results is not inconsistent with the results displayed in
Table IV-1.


Table  IV-1.  Results  of  Repeatability  Test 


Location       1        1        1        2         2         3        3        4        4
Run #          1        2        3*       1         2         1        2        1        2
Back counts    396.6    374.8    383.4    396.3     375.7     392.2    373.60   384.6    383.1
Net counts     102.38   117.27   129.62   -34.33    -20.67    27.90    17.55    53.39    68.96
Net/bckgnd     0.258    0.313    0.338    -0.0866   -0.0550   0.0711   0.0470   0.139    0.180
Pmin           0.000    0.000    0.000    0.567     0.359     0.123    0.219    0.014    0.003
Mine? (Y/N)    Y        Y        Y        N         N         N        N        Y        Y

* Location 1 was measured three times.
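
The Net/bckgnd row and the Y/N declarations in Table IV-1 can be rechecked from the other rows. The sketch below does so with the tabulated values and the 0.02 threshold quoted for the repeatability test; the data layout and names are our own, chosen for illustration.

```python
# Each tuple: (location, run, background counts, net counts, Pmin),
# transcribed from Table IV-1.
rows = [
    (1, 1, 396.6, 102.38, 0.000),
    (1, 2, 374.8, 117.27, 0.000),
    (1, 3, 383.4, 129.62, 0.000),
    (2, 1, 396.3, -34.33, 0.567),
    (2, 2, 375.7, -20.67, 0.359),
    (3, 1, 392.2, 27.90, 0.123),
    (3, 2, 373.60, 17.55, 0.219),
    (4, 1, 384.6, 53.39, 0.014),
    (4, 2, 383.1, 68.96, 0.003),
]

THRESHOLD = 0.02  # Pmin threshold used for the repeatability test

ratios = []
declarations = []
for location, run, back, net, p_min in rows:
    ratio = net / back                      # the "Net/bckgnd" statistic
    declared = "Y" if p_min < THRESHOLD else "N"
    ratios.append(ratio)
    declarations.append(declared)
    print(f"Location {location}, run {run}: net/bckgnd = {ratio:.4f}, mine? {declared}")
```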


Two results from this test merit comment. First, while the M15 and M19 contain
very similar amounts of nitrogen, and they were buried at the same depth, the M15 signal
was significantly stronger than that of the M19. Taken alone, this result may not be
statistically  significant,  but  a  similar  trend  emerged  in  the  ROC  curve  test  results  as  well 
(see  below).  Second,  the  TNA  system  indicated  a  significant  difference  between  locations 
2  and  3.  Specifically,  the  net  signal  from  location  2  was  found  to  be  much  lower  than  that 
of  location  3,  consistent  with  the  fact  that  location  2  was  a  hole  that  had  been  filled  and 
was  therefore  less  dense  than  location  3. 

2.  Receiver  Operating  Characteristic  Curves  and  Decision  Criteria 

As  discussed  in  the  Test  Data  section,  the  CDC  team  constructed  variables  called 
“P-statistics”  to  determine  whether  a  mine  was  present.  The  CDC  decision  criterion  is  the 
Pmin  statistic.  This  is  simply  the  minimum  of  five  P  statistics:  one  for  each  of  the  four 
γ-ray detectors in the system and one for the sum of all four. Figure IV-1 shows the ROC
curve  results  using  the  Pmin  statistic  as  well  as  two  alternative  statistics  discussed  below. 


Figure  IV-1.  ROC  Curves  Corresponding  to  Alternative  Decision 
Criteria  for  the  Aberdeen  Data 
The CDC team chose a nonoptimal threshold value of Pmin, and hence they
“detected” only 12 of the 19 mines, with no false alarms. However, as can be seen from the
ROC curve, a better choice of the Pmin threshold would result in the detection of 15 mines with
no false alarms.
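
The effect of the threshold choice can be illustrated with a small sketch that sweeps a threshold over a set of P values and records PD and PFA at each step. The P values below are made up for illustration and are NOT the CDC measurements; the function name `roc_points` and the convention of declaring a mine when P is at or below the threshold are likewise assumptions.

```python
def roc_points(p_values, is_mine):
    """Return (threshold, PD, PFA) for each candidate threshold.

    A location is declared a mine when its P statistic is at or below
    the threshold, so smaller P means a more mine-like signature.
    """
    n_mines = sum(is_mine)
    n_blanks = len(is_mine) - n_mines
    points = []
    for threshold in sorted(set(p_values)):
        detections = sum(1 for p, m in zip(p_values, is_mine)
                         if m and p <= threshold)
        false_alarms = sum(1 for p, m in zip(p_values, is_mine)
                           if not m and p <= threshold)
        points.append((threshold, detections / n_mines, false_alarms / n_blanks))
    return points

# Hypothetical P values: five mines followed by four mine-free locations.
p_values = [0.0001, 0.0005, 0.004, 0.010, 0.300, 0.020, 0.150, 0.400, 0.600]
is_mine = [True, True, True, True, True, False, False, False, False]

points = roc_points(p_values, is_mine)

# Operating point reached by a strict threshold of 0.001.
strict = [pt for pt in points if pt[0] <= 0.001][-1]

# Highest PD attainable without any false alarm.
best = max((pt for pt in points if pt[2] == 0.0), key=lambda pt: pt[1])

print("strict threshold:", strict)
print("best zero-false-alarm point:", best)
```

In this toy data set, as in the Aberdeen results, a threshold that is too strict gives up detections that could have been had at no cost in false alarms.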

We have investigated the performance of several simple alternatives to this
criterion. Two such alternatives are shown in Figure IV-1. One of these, Psum, is just the P
statistic associated with the sum of all four detectors. Relative to Pmin, this statistic gives
insignificantly better performance at low false-alarm rates and marginally worse performance
overall. The other alternative, N/Nbg, is just the ratio of the sum counts to the sum
background counts. It is analogous to Psum, but it neglects the spectral normalization step.
This decision criterion gives overall performance identical to Pmin (in the sense that the
areas under the curves are the same).

That  these  alternative  decision  criteria  give  similar  results  should  be  interpreted  as 
evidence  that  the  demonstrated  level  of  performance  is  not  overly  sensitive  to  the 
particular  choice  of  algorithm.  Given  the  level  of  statistical  significance  of  this  test,  it 
would  be  unsettling  to  find  that  the  relatively  subtle  distinctions  that  we  have  explored 
here  make  a  significant  difference. 

Whichever  decision  criterion  one  adopts,  study  of  the  data  reveals  that  there  are 
three  particularly  difficult  targets:  an  M19  at  1.5-in.  depth,  a  TM62P  on  the  surface,  and  a 
TM46  at  2-in.  depth.  Figure  IV-2  shows  the  N/Nbg  signature  statistic  as  measured  in 
sequence.  The  blue  triangles  show  the  ground  truth.  Note  that  each  mine  in  question  is 
surrounded  by  mine-free  locations  that  have  nearly  identical  signatures;  it  is  difficult  to 
see  how  these  mines  could  be  “pulled  out”  of  the  background. 

3.  Depth 

Figure  IV-3  shows  a  plot  of  net/bckgnd  counts  versus  depth.  While  four  of 
the  surface  mines  had  very  strong  signatures,  the  signals  of  the  buried  mines  show  no 
dependence  on  depth.  Note,  however,  that  there  is  an  unknown  error  bar  on  each  depth; 
that  is,  each  mine  was  buried  at  the  approximate  depth  indicated,  but  the  exact  depth  of 
each  mine  is  not  known. 




4.  Nitrogen  Content 

Figure  IV-4  shows  the  dependence  of  net/bckgnd  counts  on  nitrogen 
content.  There  does  not  seem  to  be  any  significant  dependence  on  nitrogen  content. 
Table  IV-2  gives  our  computation  of  nitrogen  content  of  the  various  mine  types. 


[Plot: net/bckgnd counts versus nitrogen content (kg)]

Figure IV-4. Dependence of Performance on Nitrogen Content


Table IV-2. Nitrogen Content of Various Mine Types

Mine          Explosive          Explosive     Nitrogen Mass   Nitrogen
Designation   Type               Weight (kg)   Fraction        Weight (kg)
TM62M         TNT/RDX/Aluminum   7.0           0.226           1.58
TM62P         TNT                7.2           0.185           1.332
TMA4          TNT                5.5           0.185           1.018
M19           Comp B             9.53          0.299           2.851
M15           Comp B             10.3          0.299           3.081
TM46          TNT                5.7           0.185           1.055
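
The nitrogen weights in Table IV-2 are the product of explosive weight and nitrogen mass fraction. The sketch below repeats that multiplication and, as a cross-check on the 0.185 fraction used for TNT, derives it from the molecular formula C7H5N3O6 with standard atomic masses. The dictionary layout and names are illustrative only; the M19 and M15 products differ from the tabulated values in the third decimal place, presumably because the tabulated mass fraction is rounded.

```python
# (explosive weight in kg, nitrogen mass fraction, tabulated nitrogen weight in kg)
mines = {
    "TM62M": (7.0, 0.226, 1.58),
    "TM62P": (7.2, 0.185, 1.332),
    "TMA4": (5.5, 0.185, 1.018),
    "M19": (9.53, 0.299, 2.851),
    "M15": (10.3, 0.299, 3.081),
    "TM46": (5.7, 0.185, 1.055),
}

computed = {}
for name, (weight_kg, n_fraction, tabulated_kg) in mines.items():
    computed[name] = weight_kg * n_fraction
    print(f"{name}: computed {computed[name]:.3f} kg, tabulated {tabulated_kg} kg")

# Nitrogen mass fraction of TNT from its formula, C7H5N3O6.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}
TNT_FORMULA = {"C": 7, "H": 5, "N": 3, "O": 6}

tnt_molar_mass = sum(ATOMIC_MASS[el] * n for el, n in TNT_FORMULA.items())
tnt_n_fraction = ATOMIC_MASS["N"] * TNT_FORMULA["N"] / tnt_molar_mass
print(f"TNT nitrogen mass fraction: {tnt_n_fraction:.3f}")
```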

5.  Variability  of  Background  Signature 

One  of  the  objectives  of  this  test  was  to  determine  the  effect  of  background 
variability.  In  this  section  we  limit  our  attention  to  the  test  locations  where  there  was  no 
mine  present.  We  address  the  question  of  whether  the  results  tell  us  anything  useful  about 
the  background. 
We first look at the raw number of counts, N, as a function of the sequence in
which  the  measurement  was  made.  We  choose  to  use  this  statistic  here  and  in  the 
following  discussion  because  the  expected  “statistical”  variation  of  this  quantity  is  widely 
understood  and  easy  to  compute  (square  root  of  N).  See  Figure  IV-5  to  verify  that  the 
variation  in  the  data  seems  somewhat  large  compared  to  the  error  bars  on  the  points 
(which  are  square  root  of  the  counts),  but  not  excessively  so.  If  we  look  at  the  ratio  of  the 
variance  to  the  mean  of  these  22  samples,  we  get  a  value  of  2.4;  this  is  significantly  larger 
than  the  value  of  1  that  is  expected  for  a  Poisson  process  with  a  well-defined  mean  value. 
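
The variance-to-mean ratio used above is straightforward to compute. The sketch below applies it to two made-up count series (NOT the measured data): one that scatters about a fixed mean roughly as a Poisson process would, and the same series with a slow upward drift added. The function name `dispersion_index` is hypothetical.

```python
import statistics

def dispersion_index(counts):
    """Sample variance divided by sample mean.

    A value near 1 is expected for a Poisson process with a fixed mean;
    drift or other systematic variation pushes the ratio above 1.
    """
    return statistics.variance(counts) / statistics.mean(counts)

# Hypothetical counts scattered about a constant mean of 100.
steady = [100, 88, 112, 91, 109, 114, 86, 105, 95, 100]

# Same counts with a slow upward drift of 4 counts per measurement.
drifting = [c + 4 * i for i, c in enumerate(steady)]

print(f"steady:   {dispersion_index(steady):.2f}")
print(f"drifting: {dispersion_index(drifting):.2f}")
```

A ratio well above 1, as in the drifting series, signals variation beyond counting statistics, which is the situation the value of 2.4 indicates for the 22 mine-free locations.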

In  fact,  the  variability  has  some  systematic  behavior;  there  is  a  long-term  trend  to 
higher  number  of  counts  with  time,  a  transition  at  the  fifth  measurement,  and  few  real 
outliers.  Thus  it  is  possible  that  the  periodic  background  measurements  and  recalibration 
account  for  much  of  this  variation.  This  is  indeed  the  case;  as  shown  in  Figure  IV-5,  the 
solid  line,  which  represents  the  background  measurements,  tracks  the  points  quite  well. 
Therefore, we expect the ratio of Net/Std Net, from which Ptot is derived, to be
normally distributed with zero mean and unit standard deviation. The actual values are
0.13 and 1.12, well within expected errors of the nominal values. Thus, the CDC background
subtraction protocol successfully removes much of the background variance in
this test.
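
A rough check of the phrase "well within expected errors" can be sketched as follows. It assumes the statistics are computed over the 22 mine-free locations and uses the usual large-sample approximations for the standard errors of a sample mean and a sample standard deviation drawn from a unit normal distribution; both assumptions are ours.

```python
import math

n = 22                 # mine-free locations (assumed sample size)
observed_mean = 0.13   # mean of Net/Std Net reported in the text
observed_std = 1.12    # standard deviation of Net/Std Net reported in the text

# Approximate standard errors for samples from a unit normal distribution.
se_mean = 1.0 / math.sqrt(n)
se_std = 1.0 / math.sqrt(2 * (n - 1))

# Deviation of each observed value from its nominal value (0 and 1),
# expressed in units of its standard error.
mean_deviation = (observed_mean - 0.0) / se_mean
std_deviation = (observed_std - 1.0) / se_std

print(f"mean: {mean_deviation:.2f} standard errors from 0")
print(f"std:  {std_deviation:.2f} standard errors from 1")
```

Both deviations come out below one standard error under these assumptions, in line with the statement in the text.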

Additional  verification  that  the  background  variation  is  dominated  by  counting 
statistics  is  provided  by  the  lack  of  correlation  among  the  individual  y-ray  channels.  The 
cross-correlation  coefficients  are  shown  in  the  table  below;  all  are  consistent  with  a  lack 


of  correlation. 

Correl. Coef.   Channel 2   Channel 3   Channel 4
Chan 1            -0.25        0.02        0.07
Chan 2                         0.15       -0.10
Chan 3                                     0.08
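
As a rough check on the claim that these coefficients are consistent with no correlation, each value can be compared with the scatter expected for uncorrelated data. The sketch below assumes that each coefficient was computed from n = 22 paired measurements (the text does not state the sample size for this table) and that the channels are approximately normally distributed, in which case the standard deviation of a sample correlation coefficient under the null hypothesis is 1/sqrt(n - 1).

```python
import math

n = 22  # assumed number of paired measurements per coefficient

# Cross-correlation coefficients from the table, keyed by channel pair
coefficients = {
    (1, 2): -0.25,
    (1, 3): 0.02,
    (1, 4): 0.07,
    (2, 3): 0.15,
    (2, 4): -0.10,
    (3, 4): 0.08,
}

# Standard deviation of r when the true correlation is zero
sigma_r = 1.0 / math.sqrt(n - 1)

# Size of each coefficient in units of the null-hypothesis scatter
deviations = {pair: abs(r) / sigma_r for pair, r in coefficients.items()}
largest = max(deviations.values())

for pair, z in deviations.items():
    print(f"channels {pair[0]} and {pair[1]}: {z:.2f} sigma")
```

With these assumptions, even the largest coefficient (-0.25) lies only a little more than one standard deviation from zero.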

So it seems that, in this test, the background and recalibration measurements that
are conducted as part of the CDC measurement protocol successfully mitigate the
variation of the background. The central question then becomes whether the variation is a
variation in the signature (that is, a property of the location) or simple
instrumental drift. The original test protocol specifically addressed this question in the
reproducibility test. Unfortunately, this part of the reproducibility test was not performed.
Figure IV-5. Raw Number of Counts for Measurements at Locations without
Targets. The points represent cued locations, and refer to the left axis.
The line represents background measurements, and refers to the
right axis. The axes are scaled by a 2:5 ratio, corresponding
to the relative durations of the measurements.

6.  Variability  of  Target  Signature 

As we discussed above, the response of the TNA system to the background seems
to be under control. In this section we apply similar consideration to the mines.
Unfortunately, the sample size was much smaller than desired, and the variety of mine types was
rather large. It is therefore difficult to draw statistically valid conclusions based on
controlled variables. What data we have are represented in Figure IV-6. It is clear, however,
that the target signatures exhibit a large variability relative to that of the background.

In Figure IV-6, the closed circles show the mean response in total counts, N, by
mine type and depth. Values of N corresponding to the symbols should be read off the
left-hand axis. The mine groups are labeled by mine type @ depth, with “S” in the depth
field denoting a surface mine. (Note that “surface” means that no part of the mine was
below ground level.) The mine groups are arranged along the horizontal axis with the
surface mines on the left, the buried mines next, and the point corresponding to the 22
false locations on the right. The error bars on these points indicate plus and minus one
standard deviation of the set of targets. Some points have no error bars; these correspond
to “groups” of a single mine.

Figure IV-6. Mean Number of Counts and Variance-to-Mean Ratio
by Mine Type and Depth. For explanation see text.

Each open circle represents the variance-to-mean ratio for a fixed mine type at
fixed depth. Cases without open circles are samples of one, where no variance estimate
exists. Variance-to-mean values should be read off the axis on the right. Recall that this
ratio is expected to be unity for processes where the variability is dominated by Poisson
counting statistics. Note that there are two groups, namely M19 @ 1.5 in. and TM62M @
4 in., that have quite large values of variance to mean. It is possible that the explanation
for this lies in the different soil properties at the locations of the mines.

It is likely, however, that some other source of variation is in play. The evidence
for this view is the behavior of the surface mines. Note first that the TM62P @ S mine
was indistinguishable from background level (defined by the “false” cues), while the
buried versions of the same mine had signals somewhat above background. Further, note
that the M19 @ S mine had a much smaller signature than the two M15 @ S mines (by about a
factor of four in counts above background), even though these two mine types are fairly
similar in terms of nitrogen content.

Based  on  the  variability  of  the  target  signatures,  especially  the  surface  targets,  and 
the  lack  of  variability  of  background  signature,  we  suspect  that  the  variability  in  target 
signature  may  be  due  to  extreme  sensitivity  to  the  placement  of  the  TNA  system  relative 
to  the  mine.  For  example,  the  path  from  mines  (especially  those  on  the  surface)  to  the 
gamma  detectors  may  be  partially  obstructed  by  the  shielding.  (This  seems  to  be  the  case 
based  on  line  drawings  of  the  CDC  sensor  head.)  It  is  possible  that  the  degree  of 
obstruction  of  the  mine  depends  critically  on  the  elevation  and  orientation  of  the 
detection  head.  Note  that  the  response  of  the  ground — an  extended,  homogeneous 
entity — would  not  be  so  sensitive  to  the  geometry.  It  is  also  likely  that  the  shape  of  the 
M19  (square)  plays  a  role  in  the  variability. 

7.  Soil  Content 

Approximately  2  weeks  before  the  TNA  tests  were  conducted,  soil  samples  were 
collected  roughly  every  15  m  along  the  test  lane.  Near  the  center  of  the  lane,  five  soil 
samples  were  taken  at  a  separation  distance  of  about  6  in.  (see  Figure  IV-7).  Table  IV-3 
summarizes  the  results  of  the  composition  analysis  in  the  test  lane. 

Figure IV-7. Soil Sampling Layout

For several of the elements, Figure IV-8 compares the composition by weight
averaged over the test lane with the average values cited for Earth’s crust (see footnote 2).
The APG test lane contained a significantly reduced percentage of Al, Ca, Fe, K, Na, and Ti
compared to Earth’s crust average values. Such soil conditions work in favor of TNA detection
systems, because all of those elements contribute to pile-up, particularly through radiative
capture, which for each of these elements produces γ-rays of at least 7 MeV (see footnote 3).


2  Available:  http://www.shef.ac.uk/chemistry/web-elements/index.html,  February  1999. 

3 Fe56 is 91.7 percent of naturally occurring Fe and has a capture cross-section of 2.81 b, yielding a
7.65-MeV γ-ray; Fe54 is 5.9 percent of naturally occurring Fe and has a capture cross-section of 2.16 b,
yielding a 9.30-MeV γ-ray; Al27 has a 0.23 b capture cross-section, yielding a 7.73-MeV γ-ray; Ca40
has a 0.41 b capture cross-section, yielding an 8.36-MeV γ-ray; K39 is 93.3 percent of naturally
occurring K and has a 2.10 b capture cross-section, yielding a 7.80-MeV γ-ray; K41 is 6.7 percent of
Table IV-3. Percent Composition by Weight

Sample      1     2     3     4     5-1   5-2   5-3   5-4   5-5   6     7     8     9     10
Aluminum    1.1   1.6   1.2   1.2   1.4   1.5   1.3   1.2   1.3   1.5   1.8   1.3   1.3   1.4
Calcium     1.3   1.4   1.2   1.6   1.7   1.5   1.5   1.3   1.8   1.6   1.6   1.3   1.4   1.6
Iron        1.0   1.1   1.0   1.2   1.7   1.3   1.2   1.2   1.2   1.6   1.1   1.3   1.2   1.3
Potassium   0.05  0.07  0.06  0.05  0.08  0.07  0.06  0.07  0.06  0.07  0.09  0.07  0.05  0.06
Silicon     24    23    24    25    26    20    22    23    24    21    23    24    22    24
Sodium      0.06  0.08  0.06  0.07  0.08  0.08  0.07  0.07  0.07  0.07  0.08  0.07  0.07  0.08
Titanium    0.01  0.02  0.02  0.01  0.02  0.02  0.02  0.02  0.02  0.03  0.02  0.02  0.02  0.02
Nitrogen    0.11  0.12  0.07  0.14  0.05  0.44  0.32  0.19  0.08  0.31  0.18  0.04  0.00  0.00
Carbon      0.36  0.25  0.44  0.17  0.31  0.10  0.03  0.22  0.52  0.25  0.45  1.26  1.03  0.39


Figure IV-9 compares the concentration of nitrogen and carbon averaged over the
test lane with Earth’s crust average values. The nitrogen concentration in the test lane
was almost two orders of magnitude greater than the average value cited for Earth’s crust,
while the carbon concentration in the test lane was about a factor of two higher. The
higher the nitrogen content, the greater the potential for problems for a TNA detection
system, particularly if there is a great degree of spatial variability of the nitrogen content,
which appeared to be the case for this test lane, as seen in Table IV-3 (see footnote 4).
Although there were other lanes at APG that were found to have essentially no nitrogen, the
TNA system was not tested on those lanes, so it is not possible for us to determine whether
the levels of nitrogen in the test lane had any significant impact on the performance of the
system (see footnote 5). In the future, it would definitely be worthwhile to take advantage
of the apparent variability of nitrogen content on the APG site to test the sensitivity of
the TNA detection method to nitrogen content in the soil. Carbon content in the soil may
have a modest positive impact on TNA systems because it will moderate the neutrons to some
degree, and it has a very low absorption cross-section. The fact that the carbon levels in
the test lane were about a factor of two higher than the average values quoted for Earth’s
crust may have helped the TNA performance to some degree.


naturally occurring K and has a 1.46 b capture cross-section, yielding a 7.53-MeV γ-ray; Na23 has a
0.53 b capture cross-section, yielding a 6.96-MeV γ-ray; Ti48 is 73.8 percent of naturally occurring Ti
and has a 7.84 b capture cross-section, yielding an 8.14-MeV γ-ray; Ti46 is 8 percent of naturally
occurring Ti and has a 0.60 b capture cross-section, yielding an 8.88-MeV γ-ray; Ti47 is 7.3 percent of
naturally occurring Ti and has a 1.70 b capture cross-section, yielding an 11.63-MeV γ-ray; Ti49 is
5.5 percent of naturally occurring Ti and has a 2.21 b capture cross-section, yielding a 10.94-MeV
γ-ray; and Ti50 is 5.4 percent of naturally occurring Ti and has a 0.18 b capture cross-section, yielding a
6.37-MeV γ-ray.

4 The average percent composition by weight of nitrogen over the lane was 0.146 percent, with a
standard deviation of 0.131 percent.

5 Soil samples were actually collected over the entire site, because at the time the samples were
collected, it was not known which lanes would be used for the TNA test. It was found that there were
several test lanes for which the percent composition of nitrogen was 0.00.

Figure IV-8. Soil Composition, Lane 11 and Earth’s Crust (avg):
Aluminum, Calcium, Iron, Potassium, Silicon, Sodium, and Titanium
(vertical axis: concentration, ppm by weight; horizontal axis: element)

Figure IV-9. Soil Composition, Lane 11 and Earth’s Crust (avg):
Nitrogen and Carbon
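
The nitrogen statistics quoted in footnote 4 can be recomputed from the Nitrogen row of Table IV-3. The sketch below is a check under two assumptions: that the 14 tabulated samples are the ones averaged, and that the quoted standard deviation is the sample (n - 1) estimate.

```python
import statistics

# Nitrogen row of Table IV-3, percent by weight, samples 1 through 10
# (sample 5 comprises the five closely spaced samples 5-1 to 5-5)
nitrogen_pct = [0.11, 0.12, 0.07, 0.14, 0.05, 0.44, 0.32,
                0.19, 0.08, 0.31, 0.18, 0.04, 0.00, 0.00]

mean_n = statistics.fmean(nitrogen_pct)
std_n = statistics.stdev(nitrogen_pct)  # sample standard deviation (n - 1), an assumption

print(f"mean nitrogen content:  {mean_n:.3f} percent")
print(f"standard deviation:     {std_n:.3f} percent")
print(f"std / mean:             {std_n / mean_n:.2f}")
```

The standard deviation is nearly as large as the mean, which is the spatial variability the text refers to.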

B.  SOCORRO 

All of the 19 mines were detected. Of the 19 false locations, 6 (or 32 percent)
were declared as mines. The 19 detections included 1 case where CDC indicated a
possible hit that was probably outside the nominal 30-cm range of the system.
Nevertheless, after the marked location was checked by hand to verify that it was within 5 cm
of the intended location, the report was scored as a correct detection.
V.  FINDINGS  AND  RECOMMENDATION


A.  FINDINGS 

We  summarize  our  findings  as  follows: 

1.  Aberdeen 

•  PD  =  12/19,  PFA  =  0/22. 

•  ROC  curve  (reported  previously)  showed  PD  =  15/19  was  attainable  before 
the  first  false  alarm. 

•  Alternative  approaches  showed  little  sensitivity  of  the  ROC  curve  to  the 
details  of  the  sensor  reporting  algorithm. 

•  Surface  mines  were  detected  with  greater  certainty  than  buried  mines, 
although  one  surface  mine  gave  trouble. 

•  Detectability  of  buried  mines  showed  no  dependence  on  depth. 

•  Detectability  of  buried  mines  showed  no  dependence  on  mine  type  (or 
nitrogen  content). 

•  The  departure  from  near-perfect  performance  was  dominated  by  three 
mines — M19  at  1.5  in.,  TM62P  at  0  in.,  and  TM46  at  2  in. 

•  Target  signature  variability  played  a  larger  role  than  background  variability. 

•  It is possible that the square shape of an M19 renders it more difficult to
detect. An M19 on the surface yielded the second weakest signal of all the
surface mines; its signal was even weaker than those of a TM62M and a
TMA4, both of which contain significantly less nitrogen than an M19.
Furthermore, during a separate reproducibility test, an M19 at 1.5 in. twice
yielded a significantly weaker signal than an M15 at 1.5 in., even though the
M15 contains only 10 percent more nitrogen than an M19.

•  The  variability  of  mines  on  the  surface  was  particularly  troubling. 

•  Test  execution  shortfalls  preclude  resolution  of  key  issues. 

2.  Socorro 

•  PD  =  19/19,  PFA  =  6/19. 

•  Test  execution  shortfalls  preclude  further  analysis. 
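
The detection and false-alarm fractions listed above convert to the percentages quoted elsewhere in this report as follows. This is a simple arithmetic sketch; the helper name `percent` and the labels are illustrative only.

```python
def percent(successes, trials):
    """Return successes/trials expressed as a percentage."""
    return 100.0 * successes / trials


# (numerator, denominator) pairs taken from the findings listed above
findings = {
    "Aberdeen PD, CDC threshold": (12, 19),
    "Aberdeen PD, best threshold before first false alarm": (15, 19),
    "Aberdeen PFA": (0, 22),
    "Socorro PD": (19, 19),
    "Socorro PFA": (6, 19),
}

for label, (k, n) in findings.items():
    print(f"{label}: {k}/{n} = {percent(k, n):.0f} percent")
```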
B.  RECOMMENDATION

The signature variability that is seen in these data is unique. The background
phenomenology, usually the performance driver, seems to be under control, at least at this
particular location (Aberdeen). The target signatures, however, which are usually well
modeled and fairly reproducible, are extremely variable. That significant variability appears
in a set of mines on the surface is particularly bewildering.

Unfortunately,  aspects  of  the  test  that  were  intended  to  shed  light  on  exactly  these 
issues  were  not  completed.  It  is  of  interest  to  the  Army  to  determine  whether  these  issues 
are  subject  to  amelioration  by  means  of  engineering  expedients  or  represent  fundamental 
obstacles  to  future  improvement.  We  recommend  that  resolution  of  these  issues  by 
further  testing  precede  further  development  of  this  system.  Appendix  A  describes  such  a 
test. 
APPENDIX  A
TEST  PLAN 


The paragraphs that follow comprise the original test plan for this neutron
activation study. It differs substantially from what was actually accomplished in the
field; for that information see the main text. It is included to clarify the intent of the
various phases of the test.

There  are  four  objectives  of  this  experiment,  each  of  which  is  addressed  in  a 
separate  phase.  The  Phase  1  activities  comprise  a  set  of  soil  assays  that  do  not  require  the 
availability  of  the  mine  detection  system  or  its  developers.  The  other  three  phases  consist 
of  a  series  of  2-minute  “measurements”  by  the  mine  detection  system  at  designated  (i.e., 
flagged)  measurement  locations. 

PHASE  1:  SITE  CHARACTERIZATION 

The  question  of  how  much  variability  in  neutron  activation  signature  is  to  be 
expected  on  a  given  site  has  not  yet  been  addressed.  This  goes  to  the  important  question 
of  what  is  possible  given  optimal  performance  of  the  mine  detection  system.  For 
example,  soil  nitrogen  content  is  itself  highly  variable  on  a  global  scale;  local  variations 
will  be  an  uncontrollable  system  driver,  as  will  silica  density.  A  measurement  of  site 
composition  will  be  conducted  to  enable  modeling  of  “ideal”  system  performance. 

The  soil  sampling  protocol  is  as  follows:  One  hundred  sampling  sites  will  be 
uniformly  distributed  in  distance  along  test  lanes.  At  every  10th  site,  5  samples  will  be 
taken  at  the  vertices  of  a  pentagon  roughly  6  in.  on  a  side;  at  each  of  the  other  sites  only 
one sample will be taken (see Figure A-1). For each sample, the location, volume, and
weight  will  be  recorded  on  site,  and  the  sample  bagged.  Then  the  samples  will  be  shipped 
to  the  analysis  facility,  where  each  will  be  analyzed  for  water,  nitrogen,  silicon,  iron, 
boron,  hydrogen,  aluminum,  calcium,  sodium,  potassium,  titanium,  and  gadolinium. 
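
The total number of soil samples implied by this protocol can be worked out directly. The sketch below assumes exactly 100 sites, with every 10th site (10 sites in all) receiving the five-sample pentagon; the function and parameter names are illustrative.

```python
def total_samples(n_sites=100, cluster_every=10, cluster_size=5):
    """Count soil samples when every `cluster_every`-th site gets
    `cluster_size` samples and every other site gets one."""
    clustered_sites = n_sites // cluster_every
    single_sites = n_sites - clustered_sites
    return clustered_sites * cluster_size + single_sites


print(total_samples())  # 10 clustered sites x 5 samples + 90 single-sample sites
```

Under these assumptions the plan calls for 140 samples, each analyzed for the 12 quantities listed above.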
Figure A-1. Illustration of Sample Spacing for the
Phase 1 Site Characterization Study

PHASE  2:  REPRODUCIBILITY 

Previous experience with tests of neutron activation systems makes a reproducibility
study an explicit requirement of a successful test. Without an understanding of the
reproducibility of measurements, none of the other measurements are meaningful. Both
short- and long-term stability will be assessed in Phase 2.

The reproducibility will be assessed early in the test, with ongoing spot checks to
monitor possible drift. The initial assessment will consist of five repetitions of
measurements on a series of four test locations (see Figure A-2). Two of the test locations will
have buried mines (or surrogate targets of melamine [C3N3(NH2)3] nominally sized for a
2-kg nitrogen content). The mine-detection system measurements will be made serially;
that is, the system must move and be realigned between measurements of the same
location so that the uncertainties associated with alignment are incorporated in the
reproducibility study. Subsequently, at roughly 2-hour intervals, the reproducibility site
will be revisited for single passes over each of the four test locations to assess longer term
reproducibility.


Figure A-2. Illustration of Laydown for Reproducibility Study.
The X’s denote flagged locations with buried mines, the O’s
flagged locations without mines. The separation between
the locations needs to be greater than 3 m.

PHASE  3:  SPATIAL  RESPONSE 

Neutron activation is not envisioned as a tool for wide-area search, but rather as a
confirmatory technique. Thus, the neutron activation system will be cued by other
sensors. The spatial accuracy of the cue must therefore be well matched to the spatial
response of the TNA system. In Phase 3 the spatial response function of the detection
system will be determined.

The  test  array  is  a  series  of  “columns”  of  test  locations  (see  Figure  A-3). 
Nominally,  each  column  is  a  set  of  four  test  locations.  Typically,  one  of  these  is  a 
surrogate  target,  and  the  other  three  are  at  various  distances  from  the  target,  although 
occasionally  a  column  does  not  contain  a  surrogate  at  all.  Again,  a  melamine  target 
surrogate  may  be  used.  The  purpose  of  the  “column”  configuration  is  to  save  time  when 
evaluating  the  spatial  response  profile.  The  arrangement  allows  the  four  sites  per  column 
to  be  aligned  using  the  lateral  play  of  the  system  without  moving  the  vehicle. 

The  initial  test  will  evaluate  the  spatial  response  over  an  array  of  five  columns. 
On-site  analysis  will  determine  whether  and  how  additional  measurements  will  be 
needed.  Depending  on  the  behavior  of  the  system,  up  to  25  columns  may  need  to  be 
evaluated. 
Figure A-3. Illustration of Test Columns for Spatial Response Test.
X’s and O’s denote flagged locations with and without
buried mine surrogates, respectively.


PHASE  4:  RECEIVER  OPERATING  CHARACTERISTIC 

The test site will include mines of various types, buried at different depths (see
Figure A-4). The site will also include a number of sites with no mine present. ROC
curves will be generated by mine type and depth, and for various combinations. The full
test will require about 160 measurements. Limiting the test to a subset corresponding to the
plastic casings would pare the number to about 80. If the test is truncated, it is important
that the number of mine types be limited; the number per type is already so small as to be a
problem.



Figure  A-4.  Illustration  of  Laydown  for  ROC  Study.  The  X’s  denote  flagged 
locations  with  buried  mines,  the  O’s  flagged  locations  without  mines. 

REPORTING  REQUIREMENT 

To enable trade studies that will assess costs associated with leakage and false
alarms, the test methodology will focus on generation of a ROC curve. This means that
simple binary reports are inadequate. Measurement reports must be in a continuous
variable that is monotonic in the likelihood that a mine is present, which enables setting
the decision threshold off-line. (We are being deliberately vague about the precise
meaning of the likelihood variable that will be reported, so as not to limit the options for
the system developers. For example, the likelihood variable might be counts above
background in some energy range.)
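
The way a continuous likelihood report supports off-line threshold setting can be sketched as follows. This is an illustration under stated assumptions, not the scoring procedure used in the test: the scores and truth labels are hypothetical, a location is declared a mine when its score is at or above the threshold, and the function names are invented for this example.

```python
def roc_points(scores, truth):
    """Return (threshold, PD, PFA) for each distinct score used as a threshold.

    scores: continuous likelihood values, larger meaning "more mine-like"
    truth:  1 if a mine was present at that location, 0 otherwise
    """
    n_mines = sum(truth)
    n_false = len(truth) - n_mines
    points = []
    for threshold in sorted(set(scores), reverse=True):
        detections = sum(1 for s, t in zip(scores, truth) if s >= threshold and t == 1)
        false_alarms = sum(1 for s, t in zip(scores, truth) if s >= threshold and t == 0)
        points.append((threshold, detections / n_mines, false_alarms / n_false))
    return points


def best_pd_at_zero_pfa(points):
    """Highest PD reachable before the first false alarm."""
    return max((pd for _, pd, pfa in points if pfa == 0.0), default=0.0)


# Hypothetical reports for eight flagged locations (not test data)
scores = [9.1, 7.4, 6.8, 5.0, 4.2, 3.9, 2.5, 1.1]
truth = [1, 1, 1, 0, 1, 0, 0, 0]

points = roc_points(scores, truth)
for threshold, pd, pfa in points:
    print(f"threshold {threshold:4.1f}: PD = {pd:.2f}, PFA = {pfa:.2f}")
print("best PD at zero PFA:", best_pd_at_zero_pfa(points))
```

A binary mine/no-mine report fixes a single point on this curve, which is why it cannot support the trade studies described above.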
The likelihood statistic should be generated autonomously (that is, without human
intervention) at the end of a fixed counting time (nominally, 2 minutes for this test).

To  assess  the  validity  of  efforts  to  model  TNA  system  performance,  the  gamma 
spectrum  at  each  designated  measurement  site  will  be  required. 

SCHEDULE 

The  test  schedule  needs  to  balance  the  various  objectives  and  minimize  the  risk 
that  any  objective  is  neglected.  All  of  the  objectives  are  important;  “good  enough”  on  all 
four  objectives  is  preferable  to  failure  on  any. 

The  schedule  therefore  begins  with  an  hour  of  the  reproducibility  phase;  unless 
there  are  reproducibility  problems  this  should  be  enough  to  quantify  the  variances.  Next, 
an  hour  to  an  hour  and  a  half  of  spatial  response  study  should  suffice  to  make  quantitative 
estimates  of  a  response  function  and  determine  the  need  for  additional  data.  Meanwhile, 
the  ROC  determination  can  proceed  on  the  priority  subset  of  test  sites  (color-coded 
flagging  is  suggested  for  first  pass/second  pass  measurements). 

Phase  1  is  independent  of  the  mine  detection  system  and  can  therefore  be 
scheduled  independently.  Phases  2,  3,  and  4  all  exercise  the  detection  system,  so  they 
need  to  be  scheduled  serially.  The  schedule  will  interleave  the  phases  to  optimize  value 
added,  given  the  possibility  that  time  or  equipment  constraints  may  mandate  partial 
completion  of  the  baseline  test. 
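
The counting time implied by each phase can be estimated from the measurement counts given in the plan. The sketch below covers counting time only, at the nominal 2 minutes per measurement, and assumes four locations per column in Phase 3; time for moving and realigning the system is not included and would add to these figures.

```python
MINUTES_PER_MEASUREMENT = 2  # nominal counting time stated in the plan

# Number of measurements per activity, from the phase descriptions
planned_measurements = {
    "Phase 2 initial (5 repetitions x 4 locations)": 5 * 4,
    "Phase 3 initial (5 columns x 4 locations)": 5 * 4,
    "Phase 3 maximum (25 columns x 4 locations)": 25 * 4,
    "Phase 4 full ROC test": 160,
    "Phase 4 plastic-casing subset": 80,
}


def counting_minutes(n_measurements):
    """Counting time in minutes, excluding movement and alignment."""
    return n_measurements * MINUTES_PER_MEASUREMENT


for activity, n in planned_measurements.items():
    print(f"{activity}: {n} measurements, {counting_minutes(n)} min of counting")
```

On these figures, the initial reproducibility and spatial response measurements each need about 40 minutes of counting, while the full Phase 4 test needs more than 5 hours.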
APPENDIX  B
ABERDEEN  DATA 


Table B-1 is a summary table containing the relevant data from Aberdeen Proving
Ground. It has been sorted by Pmin, which is the statistic used by CDC to determine
whether there was a mine present. The “Truth” column contains a 1 if the target was a
mine and a zero otherwise. “Net D1” refers to the net counts in detector 1, and so on for
the other detectors, while “Bck D1” refers to the background counts in detector 1, and so
on for the other detectors. “Pa D1” is the value of the P-statistic for detector 1, and so on
for the other detectors.

Figures based on these data follow Table B-1. Figures B-1(a)-(d) present the Net
counts/Background counts in each detector versus targets arranged sequentially in time,
while Figure B-1(e) presents the Net/Background counts summed over all detectors
versus target number. The filled-in circles represent those targets that were actually
mines. The mines that yielded the lowest signatures correspond to target numbers 311,
320, and 314. Note that there is nothing to distinguish these points from the nearby
background measurements. Figures B-2(a)-(d) present the Net counts/Background counts
in each detector versus the rank ordering by Pmin, and Figure B-2(e) presents the
Net/Background counts summed over all detectors versus the Pmin rank ordering. The filled-in
circles once again represent those targets that were actually mines.

In Figures B-3(a)-(f) we explore the correlations of the P-statistics in the various
detectors. The correlations are not strong; however, when the correlations of the
net-to-background counts in the various detectors are examined, as in Figures B-4(a)-(f), the
correlations are much stronger.
Table B-1. APG Summary Table